-
The data reliability, availability, and serviceability (RAS) of erasure-coded data centers are highly affected by the data repair that node failures induce. In a traditional failure identification scheme, all chunks share the same identification time threshold, which forgoes opportunities to further improve RAS. To solve this problem, we propose RAFI, a novel risk-aware failure identification scheme. In RAFI, chunk failures are identified using different time thresholds depending on how many failed chunks their stripe is experiencing. Chunks in a high-risk stripe get a shorter identification time, which improves overall data reliability and availability. Chunks in a low-risk stripe get a longer identification time, which reduces repair network traffic. As a result, reliability, availability, and serviceability improve simultaneously. We also propose three optimization techniques to reduce the additional overhead that RAFI imposes on management nodes and to ensure that RAFI works properly in large-scale clusters. We use simulation, emulation, and a prototype implementation to evaluate RAFI from multiple aspects. The simulation and prototype results confirm the effectiveness and correctness of RAFI, and running the emulator demonstrates the performance improvement that the optimization techniques bring to RAFI.
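The risk-aware idea in this abstract can be illustrated with a minimal sketch. It is not RAFI's actual implementation: the function names, the threshold values, and the mapping from the number of unavailable chunks to a threshold are all assumptions made for illustration. The only property taken from the abstract is that a stripe with more failed chunks gets a shorter identification time.

```python
from __future__ import annotations

# Hypothetical thresholds in seconds, keyed by the number of unavailable
# chunks in a stripe. These values are illustrative assumptions, not RAFI's.
THRESHOLDS_S = {1: 1800.0, 2: 600.0}
HIGH_RISK_THRESHOLD_S = 60.0  # assumed value for 3 or more unavailable chunks


def threshold_for(unavailable_count: int) -> float:
    """Return the identification threshold for a stripe's current risk level."""
    return THRESHOLDS_S.get(unavailable_count, HIGH_RISK_THRESHOLD_S)


def chunks_to_repair(unavailable_since: dict[int, float], now: float) -> list[int]:
    """Return the chunks of one stripe that should now be treated as failed.

    unavailable_since maps a chunk index to the time (in seconds) at which
    that chunk became unreachable. The threshold depends on how many chunks
    of the stripe are currently unreachable, so riskier stripes are
    identified sooner.
    """
    limit = threshold_for(len(unavailable_since))
    return sorted(c for c, t in unavailable_since.items() if now - t >= limit)
```

Under these assumed values, a stripe with a single unreachable chunk waits 30 minutes before repair is triggered, which gives a transiently failed node time to return and avoids repair traffic, while a stripe with three unreachable chunks is acted on after one minute.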
-
The Intel Optane DC Persistent Memory Module (AEP), the first commercially available Non-Volatile Memory (NVM) product, offers performance comparable to DRAM while providing larger capacity and data persistence. Existing research that substitutes NVM for DRAM or combines the two is either emulator-based or focused on improving the energy efficiency of writes. Unfortunately, the energy efficiency of real AEP systems is less explored. Based on real AEP hardware, we observe that although AEP eliminates DRAM-like refresh energy consumption, it consumes significantly different amounts of energy at different performance levels. Specifically, requests separated by time intervals (dispersed) fall behind in both performance and energy efficiency when compared with requests issued without time intervals (compact). This disparity, together with the potential to exploit parallelism, motivates us to propose Sprint-AEP, an energy-efficiency-oriented scheduling method for AEP-equipped servers. Sprint-AEP fully activates enough AEPs to serve most of the requests by deferring write requests and prefetching the hottest data. The remaining AEPs stay in idle mode, with low idle power, to save energy. We also use read parallelism to accelerate the sync and prefetching processes. Compared with energy-unaware AEP usage, our experimental results show that Sprint-AEP saves up to 26% energy with little performance degradation.